Visualising Data

Leighton Pritchard

6 September 2016

IMAGINE…

  • YOUR FIGURES ARE AMAZING!
  • BUT MISLEADING

A bar chart

Two effectors

  • Knocked out independently
  • Host chlorosis measured

Communication

  • Stories told through figures

Scales matter

  • Indication of quantities

Context matters

  • Figures are what you remember of a story

Figures can mislead

Binary thinking

  • The same or not the same
  • Larger or smaller
  • What about uncertainty?

Another Bar Chart

Four effectors

  • Bacterial effectors
  • Inoculate wild-type plants
  • Measure growth (CFU)

Four bar plots

  • Do the effectors have the same effect?

Add error bars

  • Do the effectors have the same effect?

Error bars

  • Estimates of uncertainty
  • But uncertainty of what?
  • standard deviation (sd):
    • describes the data: how much members of the group differ from the mean
  • standard error of the mean (sem):
    • describes the estimate of the mean: the standard deviation of that estimate (sd/√n for n observations)
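To make the sd/sem distinction concrete, here is a minimal sketch using only the Python standard library. The measurements are invented for illustration and the function name `sd_and_sem` is hypothetical. It suggests that sd stays roughly constant as more data are collected, while sem shrinks with sample size.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)        # spread of the data around the mean
    sem = sd / math.sqrt(len(values))    # uncertainty in the estimate of the mean
    return sd, sem


# Hypothetical measurements, invented for illustration only
small = [4.0, 6.0, 5.0, 7.0, 3.0]
large = small * 20   # similar spread, 20 times as many observations

for label, data in (("n=5", small), ("n=100", large)):
    sd, sem = sd_and_sem(data)
    print(f"{label}: mean={statistics.mean(data):.2f} sd={sd:.2f} sem={sem:.2f}")
```

An error bar drawn from sem will therefore look much tighter than one drawn from sd on the same data, which is why a figure legend needs to say which was used.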

SD or SEM?

  • Which was used (& which would you want to know)?

Raw data

  • Are they the same responses?

What does mean mean?

  • Does having the same mean imply having the same response?
  • Unequal sample sizes
  • Outliers
  • Bimodal distribution
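The cases above can be sketched with a few invented groups that share a mean of exactly 10 but differ in sample size, outliers and shape. All values and group names here are hypothetical and chosen only to illustrate the point, not taken from the talk's figures.

```python
import statistics

# Hypothetical measurements, invented for illustration; every group has mean 10
groups = {
    "symmetric": [8, 9, 10, 10, 10, 11, 12],
    "outlier":   [7, 7, 7, 7, 7, 7, 28],
    "bimodal":   [5, 5, 5, 15, 15, 15],
    "small n":   [9, 11],
}

# Summaries that a bar of the mean alone would hide
summary = {
    name: (len(v), statistics.mean(v), statistics.median(v), min(v), max(v))
    for name, v in groups.items()
}

for name, (n, mean, median, lo, hi) in summary.items():
    print(f"{name:10s} n={n} mean={mean:.1f} median={median:.1f} range={lo}-{hi}")
```

A bar chart of these four groups would show four bars of identical height, although no two groups describe the same response.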

But stats, right?

We use figures as guides…

  • “Figures tell a story, but we actually only believe the stats”
  • Usually reported as P<0.05 from a t-test (null hypothesis significance testing, NHST), with a description of the method if you’re lucky
  • Do the distributions support the use of NHST or a t-test (are the data Normal)?

…we trust the P-values

  • Bar plots hide inappropriate assumptions

Source: Weissgerber et al. (2015)

Figures can mislead

  • reinforce poor practice
    • binary thinking
    • overlooking data distributions, and using tests whose statistical assumptions do not hold
    • overlooking uncertainty
  • suggest neat stories (P<0.05)
    • data, like life, can be messy

Ways forward

Now what?

  • “Thanks for undermining me. Now what do I do about it?”
  • Other data representations are available
  • Data visualisation/statistics training courses
    • Research Data Visualisation Workshops
    • Data Carpentry
    • Software Carpentry

Anscombe’s Quartet

  • Four datasets: same means and standard deviations
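As a rough check of this claim, the sketch below computes the mean and sample standard deviation of the y-values of the four datasets. The numbers are transcribed from the commonly distributed version of Anscombe's quartet, so treat them as an assumption rather than as the data used for the slide.

```python
import statistics

# y-values of Anscombe's quartet (transcribed from the widely circulated dataset)
anscombe_y = {
    "I":   [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    "II":  [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
    "III": [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
    "IV":  [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
}

# Mean and sample standard deviation per dataset
quartet_stats = {
    name: (statistics.mean(y), statistics.stdev(y))
    for name, y in anscombe_y.items()
}

for name, (mean, sd) in quartet_stats.items():
    print(f"dataset {name:3s} mean={mean:.2f} sd={sd:.2f}")
```

The summary statistics agree to two decimal places, yet plotting the four datasets shows four visibly different patterns.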

Boxplots

  • Median, interquartile range, outliers
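The quantities a boxplot draws can be sketched as follows. This assumes the common 1.5 × IQR rule for flagging outliers, which the slide does not specify, and uses invented data; the function name `box_stats` is hypothetical.

```python
import statistics


def box_stats(values):
    """Return the quartiles and the points a boxplot would flag as outliers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: the conventional 1.5 x IQR fences
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical measurements, invented for illustration only
data = [2, 3, 4, 5, 6, 7, 8, 9, 30]
result = box_stats(data)
print(result)
print("mean:", round(statistics.mean(data), 2))
```

Here the single extreme value pulls the mean well above the median, and the boxplot makes that point visible as an outlier instead of folding it into a bar height.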

Raw data

  • 1D scatterplots

Box and raw data

  • Boxplots and jittered 1D scatterplots
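Jittering can be sketched as adding a small random horizontal offset to each point so that identical values do not overplot. The function name `jitter_positions`, the jitter width and the data below are assumptions made for illustration.

```python
import random


def jitter_positions(n_points, centre, width=0.2, seed=0):
    """Return x positions scattered uniformly within +/- width of centre."""
    rng = random.Random(seed)   # fixed seed so the figure is reproducible
    return [centre + rng.uniform(-width, width) for _ in range(n_points)]


# Hypothetical measurements, invented for illustration only
values = [5, 5, 5, 15, 15, 15]
xs = jitter_positions(len(values), centre=1.0)

for x, y in zip(xs, values):
    print(f"x={x:.3f} y={y}")
```

The resulting (x, y) pairs can be passed to any scatterplot function and overlaid on the box for the same group. Only the horizontal position is perturbed, so the plotted values remain exact.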

Violin plot

  • Data density estimate
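The outline of a violin is a density estimate of the data. A minimal sketch of one common choice, a Gaussian kernel density estimate, is given below. The bandwidth and data are invented, the function name `make_kde` is hypothetical, and plotting libraries typically choose the bandwidth automatically.

```python
import math


def make_kde(values, bandwidth):
    """Return a function giving a Gaussian kernel density estimate at x."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum one Gaussian bump per observation, centred on that observation
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm

    return density


# Hypothetical bimodal measurements, invented for illustration only
values = [5, 5, 5, 15, 15, 15]
density = make_kde(values, bandwidth=1.5)

for x in range(0, 21, 5):
    print(f"x={x:2d} density={density(x):.4f}")
```

For these data the estimate has peaks near 5 and 15 and a trough at the mean of 10, which is the shape a violin plot would draw and a bar of the mean would hide.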

Violin and raw data

  • Stacked, not jittered, data
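Stacking places points with the same (or nearly the same) value side by side, centred on the group's axis, instead of scattering them at random. The sketch below is one simple way to do this, assuming values are binned by rounding; the function name `stacked_positions` and its parameters are hypothetical.

```python
from collections import defaultdict


def stacked_positions(values, centre=0.0, spacing=0.1, bin_width=1.0):
    """Return x positions that lay out points in the same bin symmetrically."""
    bins = defaultdict(list)
    for index, value in enumerate(values):
        bins[round(value / bin_width)].append(index)

    xs = [centre] * len(values)
    for members in bins.values():
        # Centre each row of points on the group's axis
        start = centre - spacing * (len(members) - 1) / 2
        for k, index in enumerate(members):
            xs[index] = start + k * spacing
    return xs


# Hypothetical measurements, invented for illustration only
values = [5, 5, 5, 15, 15, 15, 10]
xs = stacked_positions(values)

for x, y in zip(xs, values):
    print(f"x={x:+.2f} y={y}")
```

Unlike jitter, the layout is deterministic, and the width of each row shows how many observations share that value.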